introduction to r markdown

Anna Krystalli & Mike Croucher

3 August 2016

intro

markdown .md

stripped down html

  • intended to be as easy-to-read and easy-to-write as possible.
  • intended for one purpose: to be used as a format for writing for the web.
  • syntax is very small, corresponding only to a very small subset of HTML tags.

focus on communicating & disseminating

  • formatting handled automatically
  • clean and legible across platforms and outputs

rmarkdown .Rmd

literate programming

a single document that integrates data analysis with its textual representation, linking data, code, and text

  • a documentation language
  • a programming language

rmarkdown integrates R with md

outputs

  • html
  • presentations
  • pdf
  • word documents
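
As a minimal sketch of how one source file can map to several outputs: the snippet below writes a tiny .Rmd to a temporary file and, only if the rmarkdown package and pandoc happen to be installed, renders it to HTML and Word. The file name and document contents are illustrative assumptions, not taken from the slides; PDF output is left out because it additionally needs a LaTeX installation.

```r
# Build a tiny R Markdown document (contents are illustrative).
fence <- strrep("`", 3)  # the three backticks that open and close a chunk
rmd_file <- file.path(tempdir(), "example.Rmd")
writeLines(c(
  "---",
  "title: \"My first report\"",
  "---",
  "",
  "Some *markdown* text.",
  "",
  paste0(fence, "{r}"),
  "summary(airquality$Ozone)",
  fence
), rmd_file)

# Render only if rmarkdown and pandoc are available; otherwise skip.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_out <- rmarkdown::render(rmd_file, output_format = "html_document",
                                quiet = TRUE)
  word_out <- rmarkdown::render(rmd_file, output_format = "word_document",
                                quiet = TRUE)
  print(basename(c(html_out, word_out)))
}
```

The same source file is passed to `render()` each time; only `output_format` changes.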

it’s everywhere!

Rmarkdown & reproducibility

computation & science

Computational science has led to exciting new developments:

  • Technology is increasing data collection throughput; data are more complex and high-dimensional
  • Existing databases can be merged to become bigger databases
  • Computing power allows more sophisticated analyses, even on “small” data
  • For every field “X” there is a “Computational X”

Increasing computational complexity of analyses:

the nature of the work has exposed limitations in our ability to evaluate published findings.

  • Even basic analyses are difficult to describe

  • Heavy computational requirements thrust upon people without adequate training in statistics and computing

  • Errors more easily introduced into long analysis pipelines

  • Knowledge transfer is inhibited

  • Results are difficult to replicate or reproduce

  • Complicated analyses cannot be trusted

calls for reproducibility


Reproducibility has the potential to serve as a minimum standard for judging scientific claims when full independent replication of a study is not possible.

  • fully scripted analyses
  • make code and data available

reproducibility limitations

  • top down
  • downstream

evidence based

evidence needs:

  • documenting
  • linking
  • communicating

rmarkdown is the glue that knits the tools, processes and output into evidence trails

at all stages of science

simple tools: low hanging fruit

  • begin at the start of the process
  • document & interlink evidence stream
  • explore and communicate!

empower your code and data

examples

report

code documentation

method collation

interactive documents

presentations

md basics

text

    normal text

normal text

    *italic text*

italic text

    **bold text**

bold text

    ***bold italic text***

bold italic text

    superscript^2^

superscript²

    ~~strikethrough~~

strikethrough

headers

unordered lists

ordered lists

quotes & code

> this text will be quoted

this text will be quoted

`this text will appear as code` inline

this text will appear as code inline

a <- 10
    the value of parameter *a* is `r a`

the value of parameter a is 10
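
A hedged sketch of what happens to inline code at knit time, assuming the knitr package is installed: `knitr::knit()` accepts document text directly through its `text` argument and replaces the inline expression with its value. The variable name `src` is only for illustration.

```r
library(knitr)

a <- 10

# A one-line markdown document containing an inline R expression.
src <- "the value of parameter *a* is `r a`"

# knit() evaluates the inline expression and returns the resulting markdown.
out <- knit(text = src, quiet = TRUE)
cat(out, "\n")
```

The returned text is still markdown; the italics are applied later, when the markdown is converted to its final output format.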

text formatting

images

    ![](https://www.rstudio.com/wp-content/uploads/2015/01/rmarkdown-cheatsheet-2-e1457627578814.png)
    
    ![](resources/cheat.png)
    

resize images

    <img src="resources/cheat.png" width="200px" />

basic tables

    Table Header  | Second Header
    ------------- | -------------
    Cell 1        | Cell 2
    Cell 3        | Cell 4

Table Header    Second Header
Cell 1          Cell 2
Cell 3          Cell 4

online table to .md converter

.md resources

official markdown documentation

GitHub .md basics

stackoverflow .md basics

github.io websites: eg Andy South’s blog

chunks

R code chunks can be used to render R output into documents or simply to display code for illustration

options

for more details see http://yihui.name/knitr/
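
As an illustrative sketch (the option values are assumptions, not the authors' settings), default chunk options can be set from R with `knitr::opts_chunk$set()`, typically in a first setup chunk. Individual chunks can then override them in their own header.

```r
library(knitr)

# Set default options for every chunk that follows in the document.
opts_chunk$set(
  echo = TRUE,        # show the R source in the output
  warning = FALSE,    # hide warnings
  message = FALSE,    # hide messages
  fig.width = 6,      # figure width in inches
  fig.height = 4      # figure height in inches
)

# Inspect some of the current defaults.
str(opts_chunk$get(c("echo", "warning", "fig.width")))
```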

extras

knitr::kable() tables

library(knitr)     # provides kable()
data(airquality)   # built-in data set
kable(airquality, caption = "New York Air Quality Measurements")
New York Air Quality Measurements
Ozone Solar.R Wind Temp Month Day
41 190 7.4 67 5 1
36 118 8.0 72 5 2
12 149 12.6 74 5 3
18 313 11.5 62 5 4
NA NA 14.3 56 5 5
28 NA 14.9 66 5 6
23 299 8.6 65 5 7
19 99 13.8 59 5 8
8 19 20.1 61 5 9
NA 194 8.6 69 5 10
7 NA 6.9 74 5 11
16 256 9.7 69 5 12
11 290 9.2 66 5 13
14 274 10.9 68 5 14
18 65 13.2 58 5 15
14 334 11.5 64 5 16
34 307 12.0 66 5 17
6 78 18.4 57 5 18
30 322 11.5 68 5 19
11 44 9.7 62 5 20
1 8 9.7 59 5 21
11 320 16.6 73 5 22
4 25 9.7 61 5 23
32 92 12.0 61 5 24
NA 66 16.6 57 5 25
NA 266 14.9 58 5 26
NA NA 8.0 57 5 27
23 13 12.0 67 5 28
45 252 14.9 81 5 29
115 223 5.7 79 5 30
37 279 7.4 76 5 31
NA 286 8.6 78 6 1
NA 287 9.7 74 6 2
NA 242 16.1 67 6 3
NA 186 9.2 84 6 4
NA 220 8.6 85 6 5
NA 264 14.3 79 6 6
29 127 9.7 82 6 7
NA 273 6.9 87 6 8
71 291 13.8 90 6 9
39 323 11.5 87 6 10
NA 259 10.9 93 6 11
NA 250 9.2 92 6 12
23 148 8.0 82 6 13
NA 332 13.8 80 6 14
NA 322 11.5 79 6 15
21 191 14.9 77 6 16
37 284 20.7 72 6 17
20 37 9.2 65 6 18
12 120 11.5 73 6 19
13 137 10.3 76 6 20
NA 150 6.3 77 6 21
NA 59 1.7 76 6 22
NA 91 4.6 76 6 23
NA 250 6.3 76 6 24
NA 135 8.0 75 6 25
NA 127 8.0 78 6 26
NA 47 10.3 73 6 27
NA 98 11.5 80 6 28
NA 31 14.9 77 6 29
NA 138 8.0 83 6 30
135 269 4.1 84 7 1
49 248 9.2 85 7 2
32 236 9.2 81 7 3
NA 101 10.9 84 7 4
64 175 4.6 83 7 5
40 314 10.9 83 7 6
77 276 5.1 88 7 7
97 267 6.3 92 7 8
97 272 5.7 92 7 9
85 175 7.4 89 7 10
NA 139 8.6 82 7 11
10 264 14.3 73 7 12
27 175 14.9 81 7 13
NA 291 14.9 91 7 14
7 48 14.3 80 7 15
48 260 6.9 81 7 16
35 274 10.3 82 7 17
61 285 6.3 84 7 18
79 187 5.1 87 7 19
63 220 11.5 85 7 20
16 7 6.9 74 7 21
NA 258 9.7 81 7 22
NA 295 11.5 82 7 23
80 294 8.6 86 7 24
108 223 8.0 85 7 25
20 81 8.6 82 7 26
52 82 12.0 86 7 27
82 213 7.4 88 7 28
50 275 7.4 86 7 29
64 253 7.4 83 7 30
59 254 9.2 81 7 31
39 83 6.9 81 8 1
9 24 13.8 81 8 2
16 77 7.4 82 8 3
78 NA 6.9 86 8 4
35 NA 7.4 85 8 5
66 NA 4.6 87 8 6
122 255 4.0 89 8 7
89 229 10.3 90 8 8
110 207 8.0 90 8 9
NA 222 8.6 92 8 10
NA 137 11.5 86 8 11
44 192 11.5 86 8 12
28 273 11.5 82 8 13
65 157 9.7 80 8 14
NA 64 11.5 79 8 15
22 71 10.3 77 8 16
59 51 6.3 79 8 17
23 115 7.4 76 8 18
31 244 10.9 78 8 19
44 190 10.3 78 8 20
21 259 15.5 77 8 21
9 36 14.3 72 8 22
NA 255 12.6 75 8 23
45 212 9.7 79 8 24
168 238 3.4 81 8 25
73 215 8.0 86 8 26
NA 153 5.7 88 8 27
76 203 9.7 97 8 28
118 225 2.3 94 8 29
84 237 6.3 96 8 30
85 188 6.3 94 8 31
96 167 6.9 91 9 1
78 197 5.1 92 9 2
73 183 2.8 93 9 3
91 189 4.6 93 9 4
47 95 7.4 87 9 5
32 92 15.5 84 9 6
20 252 10.9 80 9 7
23 220 10.3 78 9 8
21 230 10.9 75 9 9
24 259 9.7 73 9 10
44 236 14.9 81 9 11
21 259 15.5 76 9 12
28 238 6.3 77 9 13
9 24 10.9 71 9 14
13 112 11.5 71 9 15
46 237 6.9 78 9 16
18 224 13.8 67 9 17
13 27 10.3 76 9 18
24 238 10.3 68 9 19
16 201 8.0 82 9 20
13 238 12.6 64 9 21
23 14 9.2 71 9 22
36 139 10.3 81 9 23
7 49 10.3 69 9 24
14 20 16.6 63 9 25
30 193 6.9 70 9 26
NA 145 13.2 77 9 27
14 191 14.3 75 9 28
18 131 8.0 76 9 29
20 223 11.5 68 9 30
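
The full data set produces a long table, so a shorter variant may be easier to read. The sketch below, assuming knitr is installed, prints only the first rows; the choice of six rows and the `format = "markdown"` argument are illustrative.

```r
library(knitr)

data(airquality)

# Keep only the first six rows so the rendered table stays short.
tbl <- kable(head(airquality), format = "markdown",
             caption = "New York Air Quality Measurements")
print(tbl)
```

Inside an .Rmd chunk the `format` argument can usually be left out, since `kable()` picks a format suited to the output document.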

DT::datatable() tables

library(DT)        # provides datatable()
data(airquality)
datatable(airquality, caption = "New York Air Quality Measurements")

plotly

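library(plotly)   # also attaches ggplot2, which provides the diamonds data

# draw a reproducible random sample of 1000 diamonds
set.seed(100)
d <- diamonds[sample(nrow(diamonds), 1000), ]

# static ggplot: the text aesthetic is used by plotly for hover tooltips
p <- ggplot(data = d, aes(x = carat, y = price)) +
  geom_point(aes(text = paste("Clarity:", clarity)), size = 1) +
  geom_smooth(aes(colour = cut, fill = cut)) + facet_wrap(~ cut)

# convert the ggplot into an interactive plotly widget
ggplotly(p)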

shiny

outputs

demo

Exercise

your mission

create your first .Rmd!

  • get data from OSF
  • show us some data in a table
  • plot some data
  • fit a model
  • write a bit about what you did
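
One possible shape for the R code in such a document, sketched with the built-in `airquality` data as a stand-in because the OSF data set is not reproduced here. The choice of variables and of a linear model is an assumption for illustration only.

```r
# Stand-in data: the built-in airquality data set (not the OSF data).
data(airquality)

# Show some data in a table.
print(head(airquality))

# Plot some data.
plot(Ozone ~ Temp, data = airquality,
     xlab = "Temperature (F)", ylab = "Ozone (ppb)")

# Fit a model: ozone as a linear function of temperature.
# lm() drops rows with missing values by default.
fit <- lm(Ozone ~ Temp, data = airquality)
print(summary(fit))
```

In an .Rmd each step would sit in its own chunk, with the written explanation in the markdown text between chunks.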

see my example: